Cross-Lingual Experiments with Phone Recognition
Abstract
This paper presents recent research on speaker-independent continuous phone recognition for French and English. Phone accuracy is assessed on the BREF corpus for French and on the Wall Street Journal (WSJ) and TIMIT corpora for English. Cross-language differences in language properties are presented. French was found to be easier to recognize at the phone level (the phone error is 23.6% for BREF vs. 30.1% for WSJ), but harder to recognize at the lexical level because of its larger number of homophones. Experiments with signal analysis indicate that a 4 kHz signal bandwidth is sufficient for French, whereas 8 kHz is needed for English. Phone recognition is a powerful technique for language, sex, and speaker identification. With 2 s of speech, the language can be identified with better than 99% accuracy. Sex identification for BREF and WSJ is error-free. Speaker identification accuracies of 98.2% on TIMIT (462 speakers) and 99.1% on BREF (57 speakers) were obtained with one utterance per speaker, and 100% with two utterances.
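The abstract reports phone error figures without stating how they are computed. The sketch below is a minimal illustration that assumes the conventional definition: substitutions, deletions, and insertions from a minimum-edit-distance alignment, divided by the number of reference phones. The function name `phone_error_rate` and the sample phone strings are hypothetical and are not taken from the paper.

```python
def phone_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Assumption: the conventional edit-distance definition of phone error;
    the paper's exact scoring procedure is not given in the abstract.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one phone")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference phone
                curr[j - 1] + 1,     # insertion of a hypothesis phone
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[m] / n


# Hypothetical phone strings: one substitution (b -> x) and one deletion (d).
reference = ["a", "b", "c", "d"]
hypothesis = ["a", "x", "c"]
per = phone_error_rate(reference, hypothesis)
print(f"phone error: {per:.1%}, phone accuracy: {1 - per:.1%}")
```

Under this definition, phone accuracy is one minus the phone error, so a 23.6% phone error would correspond to 76.4% phone accuracy.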
Similar resources
Speech Recognition in 7 Languages
In this study we present approaches to multilingual speech recognition. We first define different approaches, namely portation, cross-lingual and simul...
Cross-lingual portability of MLP-based tandem features - a case study for English and Hungarian
One promising approach for building ASR systems for less-resourced languages is cross-lingual adaptation. Tandem ASR is particularly well suited to such adaptation, as it includes two cascaded modelling steps: feature extraction using multi-layer perceptrons (MLPs), followed by modelling using a standard HMM. The language-specific tuning can be performed by adjusting the HMM only, leaving the ML...
Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model
Phoneme-based multilingual training and different cross-lingual adaptation techniques for Automatic Speech Recognition (ASR) are explored in Connectionist Temporal Classification (CTC)-based systems. The multilingual model is trained to model a universal IPA-based phone set using the CTC loss function. While the same IPA symbol may not correspond to acoustic similarity, Learning Hidden Unit Contribu...
Mandarin/English mixed-lingual name recognition for mobile phone
Speaker-independent name speech recognition has become a popular application in handheld devices such as mobile phones and personal digital assistants (PDAs). This paper presents a new mixed-lingual ASR system that will enable Chinese mobile phone users to conduct Mandarin and English name speech recognition simultaneously without switching language modes. We created an elaborately designed mixed acous...
Context-sensitive probabilistic phone mapping model for cross-lingual speech recognition
This paper presents a probabilistic phone mapping model (PPM) that makes possible automatic speech recognition using a foreign phonetic system. We formulate the training of the phone mapping model in the framework of maximum likelihood estimation. The model can be learned automatically from the reference phonetic transcript and the phonetic transcript resulting from a foreign phonetic recognise...
Publication date: 2007